Somerset
Optimal Projections for Classification with Naive Bayes
Hofmeyr, David P., Kamper, Francois, Melonas, Michail M.
In the Naive Bayes classification model the class conditional densities are estimated as the products of their marginal densities along the cardinal basis directions. We study the problem of obtaining an alternative basis for this factorisation with the objective of enhancing the discriminatory power of the associated classification model. We formulate the problem as a projection pursuit to find the optimal linear projection on which to perform classification. Optimality is determined based on the multinomial likelihood within which probabilities are estimated using the Naive Bayes factorisation of the projected data. Projection pursuit offers the added benefits of dimension reduction and visualisation. We discuss an intuitive connection with class conditional independent components analysis, and show how this is realised visually in practical applications. The performance of the resulting classification models is investigated on a large collection (162) of publicly available benchmark data sets, in comparison with relevant alternatives. We find that the proposed approach substantially outperforms other popular probabilistic discriminant analysis models and is highly competitive with Support Vector Machines.
- North America > United States > New Jersey > Somerset County > Somerset (0.04)
- Europe > United Kingdom (0.04)
- Europe > Switzerland (0.04)
- Africa > South Africa (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
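The factorisation described in the abstract above can be illustrated with a minimal sketch: project each point onto a set of directions, then score each class as its prior times the product of one-dimensional marginal densities along those directions. Everything named here is an assumption for illustration: the helper names (`project`, `fit`, `log_joint`, `predict`) are hypothetical, the marginals are modelled as Gaussians (the abstract does not say how they are estimated), and the basis is supplied by hand rather than found by the projection pursuit the paper proposes.

```python
import math

def project(x, basis):
    """Coordinates of point x along each direction (row) of basis."""
    return [sum(v_i * x_i for v_i, x_i in zip(v, x)) for v in basis]

def fit(X, y, basis):
    """Per class: prior and (mean, variance) of each projected coordinate."""
    model = {}
    for label in sorted(set(y)):
        Z = [project(x, basis) for x, t in zip(X, y) if t == label]
        stats = []
        for col in zip(*Z):
            mean = sum(col) / len(col)
            # Small constant keeps the variance strictly positive.
            var = sum((z - mean) ** 2 for z in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[label] = (len(Z) / len(X), stats)
    return model

def log_joint(x, label, model, basis):
    """log prior + sum of Gaussian marginal log densities (assumed form)."""
    prior, stats = model[label]
    total = math.log(prior)
    for z_j, (mean, var) in zip(project(x, basis), stats):
        total += -0.5 * math.log(2 * math.pi * var) - (z_j - mean) ** 2 / (2 * var)
    return total

def predict(x, model, basis):
    """Class with the largest joint log score."""
    return max(model, key=lambda label: log_joint(x, label, model, basis))

# Toy data: the classes differ only in the sum of the two coordinates,
# so a single direction (1, 1) is enough to separate them.
X = [(0, 0), (1, -1), (-1, 1), (0.5, 0), (2, 2), (3, 1), (1, 3), (2, 2.5)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
basis = [[1.0, 1.0]]
model = fit(X, y, basis)
```

Using one direction for two-dimensional data also shows the dimension reduction mentioned in the abstract; choosing that direction by maximising the multinomial likelihood is the part this sketch leaves out.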
Counterfactual Collaborative Reasoning
Ji, Jianchao, Li, Zelong, Xu, Shuyuan, Xiong, Max, Tan, Juntao, Ge, Yingqiang, Wang, Hao, Zhang, Yongfeng
Causal reasoning and logical reasoning are two important types of reasoning abilities for human intelligence. However, their relationship has not been extensively explored in the context of machine intelligence. In this paper, we explore how the two reasoning abilities can be jointly modeled to enhance both the accuracy and the explainability of machine learning models. More specifically, by integrating two important types of reasoning ability -- counterfactual reasoning and (neural) logical reasoning -- we propose Counterfactual Collaborative Reasoning (CCR), which conducts counterfactual logic reasoning to improve performance. In particular, we use recommender systems as an example to show how CCR alleviates data scarcity, improves accuracy and enhances transparency. Technically, we leverage counterfactual reasoning to generate "difficult" counterfactual training examples for data augmentation, which -- together with the original training examples -- can enhance model performance. Since the augmented data are independent of the model, they can be used to enhance any model, which makes the technique widely applicable. Besides, most existing data augmentation methods focus on "implicit data augmentation" over users' implicit feedback, while our framework conducts "explicit data augmentation" over users' explicit feedback based on counterfactual logic reasoning. Experiments on three real-world datasets show that CCR achieves better performance than non-augmented models and implicitly augmented models, and also improves model transparency by generating counterfactual explanations.
- Asia > Singapore > Central Region > Singapore (0.05)
- North America > United States > New Jersey > Middlesex County > New Brunswick (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.66)
A Minimalist Model of the Artificial Autonomous Moral Agent (AAMA)
Howard, Don (University of Notre Dame) | Muntean, Ioan (University of Notre Dame)
This paper proposes a model for an artificial autonomous moral agent (AAMA), which is parsimonious in its ontology and minimal in its ethical assumptions. Starting from a set of moral data, this AAMA is able to learn and develop a form of moral competency. It resembles an “optimizing predictive mind,” which uses moral data (describing typical behavior of humans) and a set of dispositional traits to learn how to classify different actions (given some background knowledge) as morally right, wrong, or neutral. When confronted with a new situation, this AAMA is supposedly able to predict a behavior consistent with the training set. This paper argues that a promising computational tool that fits our model is “neuroevolution,” i.e. evolving artificial neural networks.
- North America > United States > New York (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (9 more...)
Analysis of Stopping Active Learning based on Stabilizing Predictions
Bloodgood, Michael, Grothendieck, John
Within the natural language processing (NLP) community, active learning has been widely investigated and applied in order to alleviate the annotation bottleneck faced by developers of new NLP systems and technologies. This paper presents the first theoretical analysis of stopping active learning based on stabilizing predictions (SP). The analysis has revealed three elements that are central to the success of the SP method: (1) bounds on Cohen's Kappa agreement between successively trained models impose bounds on differences in F-measure performance of the models; (2) since the stop set does not have to be labeled, it can be made large in practice, helping to guarantee that the results transfer to previously unseen streams of examples at test/application time; and (3) good (low variance) sample estimates of Kappa between successive models can be obtained. Proofs of relationships between the level of Kappa agreement and the difference in performance between consecutive models are presented. Specifically, if the Kappa agreement between two models exceeds a threshold T (where $T>0$), then the difference in F-measure performance between those models is bounded above by $\frac{4(1-T)}{T}$ in all cases. If precision of the positive conjunction of the models is assumed to be $p$, then the bound can be tightened to $\frac{4(1-T)}{(p+1)T}$.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > Delaware > New Castle County > Newark (0.14)
- (18 more...)
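The two bounds quoted in the abstract above are simple to evaluate numerically. The sketch below computes Cohen's Kappa between the predictions of two models on an unlabeled stop set using the standard definition (observed agreement corrected for chance agreement), then evaluates the stated upper bounds $\frac{4(1-T)}{T}$ and $\frac{4(1-T)}{(p+1)T}$. The function names and the toy predictions are hypothetical, chosen only for illustration.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's Kappa between two equally long sequences of predicted labels."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability both models emit the same label
    # if each drew labels independently from its own label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both models always predict the same single label
    return (observed - expected) / (1.0 - expected)

def f_measure_diff_bound(T, p=None):
    """Upper bound on the F-measure difference given Kappa agreement above T.

    Without p: 4(1 - T) / T. With p (precision of the positive
    conjunction of the models): 4(1 - T) / ((p + 1) T).
    """
    if not 0.0 < T <= 1.0:
        raise ValueError("T must lie in (0, 1]")
    if p is None:
        return 4.0 * (1.0 - T) / T
    return 4.0 * (1.0 - T) / ((p + 1.0) * T)

# Hypothetical predictions of two successively trained models.
previous_model = [1, 1, 0, 0]
current_model = [1, 1, 0, 1]
kappa = cohen_kappa(previous_model, current_model)
```

Because an F-measure difference can never exceed 1, the general bound only constrains anything when $T > 0.8$; for example, $T = 0.99$ gives a bound of roughly 0.04.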
Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation
Bloodgood, Michael, Callison-Burch, Chris
We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find an order-of-magnitude increase in the rate of performance improvement.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- (15 more...)
Efficiently Inducing Features of Conditional Random Fields
Conditional Random Fields (CRFs) are undirected graphical models, a special case of which corresponds to conditionally-trained finite state machines. A key advantage of these models is their great flexibility to include a wide array of overlapping, multi-granularity, non-independent features of the input. In the face of this freedom, an important question remains: what features should be used? This paper presents a feature induction method for CRFs. Founded on the principle of constructing only those feature conjunctions that significantly increase log-likelihood, the approach is based on that of Della Pietra et al. [1997], but altered to work with conditional rather than joint probabilities, and with additional modifications for providing tractability specifically for a sequence model. In comparison with traditional approaches, automated feature induction offers both improved accuracy and more than an order of magnitude reduction in feature count; it enables the use of richer, higher-order Markov models, and offers more freedom to liberally guess about which atomic features may be relevant to a task. The induction method applies to linear-chain CRFs, as well as to more arbitrary CRF structures, also known as Relational Markov Networks [Taskar & Koller, 2002]. We present experimental results on a named entity extraction task.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- North America > United States > New Jersey > Somerset County > Somerset (0.04)
- North America > United States > California > Santa Clara County > Los Altos (0.04)
- Asia > Middle East > Lebanon (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
Simulating Human Ratings on Word Concreteness
Feng, Shi (University of Memphis) | Cai, Zhiqiang (University of Memphis) | Crossley, Scott (Georgia State University) | McNamara, Danielle S (University of Memphis)
A single word in the human language has many complex dimensions such as semantics, parts of speech, lexical type, imagability, concreteness, familiarity, etc. It is important to know the dimensions of words in languages so that we can develop a better theoretical understanding of language and also to build tools that simulate human intelligence and ... However, word concreteness is not an attribute that a computer can directly compute. One means of assessing the characteristics of words is by having humans rate them on the dimensions of interest. Humans are proficient in categorizing words into linguistic dimensions, but it is impractical to have humans rating tens of thousands of words that we would need for psycholinguistic research.
- North America > United States > New Jersey > Bergen County > Mahwah (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Tennessee > Shelby County > Memphis (0.04)
- North America > United States > New Jersey > Somerset County > Somerset (0.04)
NP Animacy Identification for Anaphora Resolution
In anaphora resolution for English, animacy identification can play an integral role in the application of agreement restrictions between pronouns and candidates, and as a result, can improve the accuracy of anaphora resolution systems. In this paper, two methods for animacy identification are proposed and evaluated using intrinsic and extrinsic measures. The first method is a rule-based one which uses information about the unique beginners in WordNet to classify NPs on the basis of their animacy. The second method relies on a machine learning algorithm which exploits a WordNet enriched with animacy information for each sense. The effect of word sense disambiguation on the two methods is also assessed. The intrinsic evaluation reveals that the machine learning method reaches human levels of performance. The extrinsic evaluation demonstrates that animacy identification can be beneficial in anaphora resolution, especially in the cases where animate entities are identified with high precision.
- Europe > United Kingdom > England > Lancashire > Lancaster (0.04)
- Europe > France (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- (12 more...)
Corpus-Based Approaches to Semantic Interpretation in NLP
In recent years, there has been a flurry of research into empirical, corpus-based learning approaches to natural language processing (NLP). Most empirical NLP work to date has focused on relatively low-level language processing such as part-of-speech tagging, text segmentation, and syntactic parsing. The success of these approaches has stimulated research in using empirical learning techniques in other facets of NLP, including semantic analysis -- uncovering the meaning of an utterance. This article is an introduction to some of the emerging research in the application of corpus-based learning techniques to problems in semantic interpretation. In particular, we focus on two important problems in semantic interpretation, namely, word-sense disambiguation and semantic parsing.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- (12 more...)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Automating Knowledge Acquisition for Machine Translation
Machine translation of human languages (for example, Japanese, English, Spanish) was one of the earliest goals of computer science research, and it remains an elusive one. Like many AI tasks, translation requires an immense amount of knowledge about language and the world. Recent approaches to machine translation frequently make use of text-based learning algorithms to fully or partially automate the acquisition of knowledge. This article illustrates these approaches.
- North America > United States > New Jersey > Somerset County > Somerset (0.05)
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > New York (0.05)
- (5 more...)